11 research outputs found

    Gimli Encryption in 715.9 psec

    We study the encryption latency of the Gimli cipher, which has recently been submitted to NIST’s Lightweight Cryptography competition. We develop two optimized hardware engines for the 24-round Gimli permutation, characterized by a total latency of 3 and 4 cycles, respectively, in a range of frequencies up to 4.5 GHz. Specifically, we utilize Intel’s 10 nm FinFET process to synthesize a critical path of 15 logic levels, supporting a depth-3 Gimli pipeline capable of computing the result of the Gimli permutation at frequencies up to 3.9 GHz. On the same process technology, a depth-4 pipeline employs a critical path of 12 logic levels and can compute the Gimli permutation at frequencies up to 4.5 GHz. Gimli demonstrates a total unrolled data path latency of 715.9 psec. Compared to our AES implementation, our fastest pipelined Gimli engine demonstrates 3.39 times smaller latency. When compared to the latency of the PRINCE lightweight block cipher, the pipelined Gimli latency is 1.7 times smaller. The paper suggests that the Gimli cipher and our proposed optimized implementations have the potential to provide breakthrough performance for latency-critical applications in domains such as data storage, networking, IoT and gaming.
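
As a rough sanity check on the figures quoted in this abstract, the latency of a pipelined engine is its cycle count divided by the clock frequency. The sketch below only does that arithmetic; the function name `pipeline_latency_ps` is hypothetical, and the maximum frequencies from the abstract are used as illustrative operating points.

```python
def pipeline_latency_ps(cycles: int, freq_ghz: float) -> float:
    """Latency in picoseconds of a pipeline taking `cycles` clocks at `freq_ghz` GHz."""
    # One clock period at f GHz lasts 1000 / f picoseconds.
    return cycles * 1000.0 / freq_ghz


# Operating points taken from the abstract (maximum quoted frequencies).
depth3_ps = pipeline_latency_ps(3, 3.9)  # depth-3 pipeline, roughly 769 psec
depth4_ps = pipeline_latency_ps(4, 4.5)  # depth-4 pipeline, roughly 889 psec

print(f"depth-3: {depth3_ps:.1f} psec, depth-4: {depth4_ps:.1f} psec")
```

Both values sit above the 715.9 psec unrolled data path latency quoted in the abstract.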

    Cryptographic Constructions Supporting Implicit Data Integrity

    We study a methodology for supporting data integrity called "implicit integrity" and present cryptographic constructions supporting it. Implicit integrity allows for corruption detection without producing, storing or verifying mathematical summaries of the content such as MACs and ICVs, or any other type of message expansion. As with authenticated encryption, the main idea behind this methodology is that, whereas typical user data demonstrate patterns such as repeated bytes or words, decrypted data resulting from corrupted ciphertexts no longer demonstrate such patterns. Thus, by checking the entropy of some decrypted ciphertexts, corruption can possibly be detected. The main contribution of this paper is a notion of security which is associated with implicit integrity, and which is different from the typical requirement that the output of cryptographic systems should be indistinguishable from the output of a random permutation. The notion of security we discuss reflects the fact that it should be computationally difficult for an adversary to corrupt some ciphertext so that the resulting plaintext demonstrates specific patterns. We introduce two kinds of adversaries. First, an input-perturbing adversary performs content corruption attacks. Second, an oracle-replacing adversary performs content replay attacks. We discuss requirements for supporting implicit integrity in these two adversary models, and provide security bounds for a construction called IVP, a three-level confusion-diffusion network which can support implicit integrity and is inexpensive to implement.
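
The pattern check described in this abstract can be illustrated with a minimal sketch. It assumes one simple pattern type (a byte value repeated several times within a block) and an arbitrary threshold; the function name `looks_like_user_data` and the `min_repeats` parameter are hypothetical, and the paper's actual pattern detectors and the IVP construction are not reproduced here.

```python
from collections import Counter


def looks_like_user_data(block: bytes, min_repeats: int = 4) -> bool:
    """Return True if some byte value occurs at least `min_repeats` times in `block`.

    Assumption: repeated bytes stand in for the "patterns" of typical user data.
    """
    if not block:
        return False
    # Count of the most frequent byte value in the block.
    top_count = Counter(block).most_common(1)[0][1]
    return top_count >= min_repeats


# A block with many zero bytes passes the check (illustrative structured data).
structured = b"\x00" * 8 + b"abcdefgh"
# A block with all-distinct bytes stands in for random-looking decrypted output.
scrambled = bytes(range(16))

print(looks_like_user_data(structured), looks_like_user_data(scrambled))
```

In this toy model, a decrypted block that fails the check would be flagged as possibly corrupted, with no MAC or ICV stored alongside the ciphertext.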

    K-Cipher: A Low Latency, Bit Length Parameterizable Cipher

    We present the design of a novel low latency, bit length parameterizable cipher, called the "K-Cipher". K-Cipher is particularly useful to applications that need to support ultra low latency encryption at arbitrary ciphertext lengths. We can think of a range of networking, gaming and computing applications that may require encrypting data at unusual block lengths for many different reasons, such as to make space for other unencrypted state values. Furthermore, in modern applications, encryption is typically required to complete inside stringent time frames in order not to affect performance. K-Cipher has been designed to meet these requirements. In the paper we present the K-Cipher design and specification and discuss its security properties. Our analysis indicates that K-Cipher is secure against both known-ciphertext and adaptive chosen-plaintext adversaries. Finally, we present synthesis results of 32-bit and 64-bit K-Cipher encrypt datapaths. Our results show that the encrypt datapaths can complete in no more than 767 psec, or 3 clocks at frequencies of 3.9-4.9 GHz, and are associated with a maximum area requirement of 1875 µm².

    The MAGIC Mode for Simultaneously Supporting Encryption, Message Authentication and Error Correction

    We present MAGIC, a mode for authenticated encryption that simultaneously supports encryption, message authentication and error correction, all with the same code. In MAGIC, the same code employed for cryptographic integrity is also the parity used for error correction. To correct errors, MAGIC employs the Galois Hash transformation, which due to its bit linearity can perform corrections in a similar way as other codes do (e.g., Reed-Solomon). To provide a cryptographically strong MAC, MAGIC encrypts the output of the Galois Hash using a secret key. To analyze the security of this construction we adapt the definition of the MAC adversary so that it is applicable to systems that combine message authentication with error correction. We demonstrate that MAGIC offers security on the order of O(2^(N/2)), with N being the tag size.

    Test and Debug Solutions for 3D-Stacked Integrated Circuits

    Three-dimensional (3D) stacking using through-silicon vias (TSVs) promises higher integration levels in a single package, keeping pace with Moore's law. TSVs are small copper or tungsten vias that go vertically through the substrate of a die and provide vertical interconnects to a die stacked on top. TSV-based interconnects have benefits in terms of performance, interconnect density, and power efficiency.
    Testing has been identified as a showstopper for volume manufacturing of 3D-stacked integrated circuits (3D ICs). A number of challenges associated with 3D test need to be addressed before 3D ICs can become economically viable. This dissertation provides solutions to new challenges related to 3D test content, test access, diagnosis and debug.
    Test content specific to 3D ICs targets defects that occur during the TSV manufacturing and stacking process. One example is the effect of thermo-mechanical stress due to the TSV fabrication process on the surrounding logic gates. In this dissertation, we analyze these effects and their consequences for delay testing. We provide quantitative results showing that the use of TSV-stress-oblivious circuit models for test generation leads to a considerable reduction in delay-test quality. We propose a test flow that uses TSV-stress-aware circuit models to improve test quality.
    Another example of a 3D-specific test challenge is the testability of TSVs. In this dissertation, we focus on TSV test prior to die bonding, as access to TSVs is limited at this stage. We propose a non-invasive method for pre-bond TSV test that does not require TSV probing. The method uses ring oscillators and duty-cycle detectors in order to detect variations in propagation delay of gates connected to a single-sided TSV. Based on the measured variations, we can diagnose the TSV and predict the size of resistive-open and leakage faults using a regression model based on artificial neural networks. In addition, we exploit different voltage levels to increase the robustness of the test method.
    In order to efficiently deliver test content to structures under test in a 3D stack, 3D design-for-test (DfT) architectures are needed. In this dissertation, we discuss existing 3D-DfT architectures and their optimization. We propose an optimization approach that takes uncertainties in input parameters into account and provides a solution that is efficient in the presence of input-parameter variations and minimizes test time, therefore reducing test cost.
    Post-silicon debug is a major challenge due to continuously increasing design complexity. Traditional debug methods using signal tracing suffer from the limited capacity of on-chip trace buffers that only allow for signal observation during a short time window. This dissertation proposes a low-cost debug architecture for massive signal tracing in 3D-stacked ICs with wide-I/O DRAM dies. The key idea is to use available on-chip DRAM for trace-data storage, which results in a significant increase of the observation window compared to traditional methods that use trace buffers. In addition, the proposed on-chip debug circuitry can identify erroneous segments of observed data by using compact signatures that are stored in the DRAM a priori. Only failing intervals are off-loaded from a temporary trace buffer into DRAM, allowing for a more efficient use of the memory, resulting in a larger observation window.
    In summary, this dissertation provides solutions to several challenges related to 3D test and debug that need to be addressed before volume manufacturing of 3D ICs can be viable.
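
The signature-based trace filtering described in this abstract can be sketched in software. The sketch assumes fixed-length intervals and uses CRC-32 as a stand-in for the compact signatures; the actual on-chip signature scheme is not specified in the abstract, and the names `signatures` and `failing_intervals` are hypothetical.

```python
import zlib


def signatures(trace: bytes, interval_len: int) -> list:
    """Compute one compact signature (CRC-32 here, as an assumption) per interval."""
    return [zlib.crc32(trace[i:i + interval_len])
            for i in range(0, len(trace), interval_len)]


def failing_intervals(observed: bytes, golden_sigs: list, interval_len: int) -> list:
    """Return (index, data) for each interval whose signature differs from the golden one."""
    failing = []
    for idx, expected in enumerate(golden_sigs):
        segment = observed[idx * interval_len:(idx + 1) * interval_len]
        if zlib.crc32(segment) != expected:
            # Only failing intervals would be off-loaded to DRAM.
            failing.append((idx, segment))
    return failing


golden_trace = bytes(range(32))            # illustrative error-free trace
golden_sigs = signatures(golden_trace, 8)  # stored a priori
observed = bytearray(golden_trace)
observed[10] ^= 0x01                       # inject a single-bit error in interval 1
stored = failing_intervals(bytes(observed), golden_sigs, 8)
print([idx for idx, _ in stored])
```

Keeping only the intervals whose signatures mismatch is what lets the same amount of memory cover a longer observation window.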

    Controlled toggle rate of non-test signals during modular scan testing of an integrated circuit

    A method is provided to test a modular integrated circuit (IC) comprising: testing a module-under-test (MUT) within the IC while causing a controlled toggle rate within a first neighbor module of the MUT; wherein the controlled toggle rate within the first neighbor module is selected so that toggling within the first neighbor module has substantially the same effect upon operation of the MUT as operation of the first neighbor module would have during actual normal functional operation of the first neighbor module.

    Controlled toggle rate of non-test signals during modular scan testing of an integrated circuit

    A method is provided to test a modular integrated circuit (IC) 100 comprising: testing a module-under-test (MUT) 101B within the IC 100 while causing a controlled toggle rate within a first neighbor module 101A of the MUT 101B; wherein the controlled toggle rate within the first neighbor module 101A is selected so that toggling within the first neighbor module 101A has substantially the same effect upon operation of the MUT 101B as operation of the first neighbor module 101A would have during actual normal functional operation of the first neighbor module 101A.

    Optimization of test‐access architectures and test scheduling for 3D ICs

    This chapter presents a method for robust optimization of 3D test architecture and test scheduling in the presence of input parameter variations. It lists examples of uncertainties in input parameters for 3D test architecture optimization and test scheduling. The chapter then formulates an integer linear programming (ILP) model for robust optimization of 3D test architecture. A recent work has formulated a mathematical model for robust optimization of 3D test architecture and test scheduling and proposed a heuristic based on simulated annealing in order to solve the robust optimization problem for realistic 3D‐ICs. This chapter presents simulation results to evaluate the proposed heuristic method for robust optimization. The framework is implemented in C++. The chapter demonstrates the effect of robust optimization using a simple example. It also shows simulation results obtained with publicly available benchmarks.

    3D design‐for‐test architecture

    IMEC and Cadence have jointly developed a 3D design‐for‐test (DfT) architecture that serves both 2.5D and 3D stacked integrated circuits (SICs). The architecture originally targeted stacks of monolithic, non‐hierarchical, logic‐only dies. A 3D‐DfT demonstrator circuit was designed, manufactured, and tested as part of an IMEC 3D chip stack nicknamed “Vesuvius‐3D.” Over time, our architecture has been extended to (i) include multi‐tower stacks, support hierarchical systems on chip (SoCs) containing (ii) test data compression and (iii) embedded cores, (iv) allow for at‐speed interconnect testing, and (v) cover memory‐on‐logic stacks.